DISCRETE GRADIENT METHODS FOR PRESERVING A FIRST 
INTEGRAL OF AN ORDINARY DIFFERENTIAL EQUATION 



Richard A. Norton and G. R. W. Quispel 

Department of Mathematics and Statistics 
La Trobe University 
Melbourne, Victoria 3086, Australia 

This paper is dedicated to Arieh Iserles, a dear friend and a wonderful colleague. 

Abstract. In this paper we consider discrete gradient methods for approxi- 
mating the solution and preserving a first integral (also called a constant of 
motion) of autonomous ordinary differential equations. We prove under mild 
conditions for a large class of discrete gradient methods that the numerical 
solution exists and is locally unique, and that for arbitrary p G N we may con- 
struct a method that is of order p. In the proofs of these results we also show 
that the constants in the time step constraint and the error bounds may be 
chosen independently from the distance to critical points of the first integral. 

In the case when the first integral is quadratic, for arbitrary p £ N, we 
have devised a new method that is linearly implicit at each time step and of 
order p. This new method has significant advantages in terms of efficiency. We 
illustrate our theory with a numerical example. 

1. Introduction. Consider the autonomous ordinary differential equation (ODE) 

x = f(x) t > 0, (1) 

where x(t) G M. d for some d G N, x(0) = xq G R d is the initial condition and 
/ : K d — > IR d is locally Lipschitz continuous. Then given a bounded set B C M. d , 
there exists a T > such that for any xo G B the solution exists and remains 
bounded for t G [0, T] (see e.g. [5, Thm. 1.7.3 on p. 37]). We assume that this 
ODE has a conserved first integral (also called a constant of motion) I : M d — > R 
such that 

I(x(t)) = I(x ) for all t G [0, T], (2) 
To simplify notation define i(x) := V/(x) for all x G M. d , and assume that i(x) is 
locally Lipschitz continuous. Define R+ := {t G K : t > 0}. According to [9], on 
{x G M d : i(x) ^ 0} we may write (1) as 

x = S(x)i(x) (3) 

where S(x) G M. dxd is skew-symmetric (S T = —S) and may be given by the so-called 
default formula 

f(x) l (x) T - i (x)f(x) T 

s{x) ■ (4) 
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In general, the choice of S(x) satisfying f(x) — S(x)i(x) is not unique. Moreover, 
Proposition 2.1 in [9] states that if / £ C r (R d ;R d ) for r > 1 and / is a Morse 
function (i.e. smooth with non-degenerate critical points) then S in (4) is C r and 
locally bounded on {x £ R d : i(x) ^ 0}, and in the proof of Proposition 2.1 we 
also have that \f(x)\/\i(x)\ is locally bounded on {x £ K d : i(x) ^ 0}. In fact, the 
requirement that / £ C r (R d ;R d ) for r > 1 may be relaxed to / locally Lipschitz 
continuous so that S is also only locally Lipschitz continuous on {x £ M. d : i(x) ^ 0}. 

Let us also make the assumption that / is a Morse function so that for a bounded 
set B C K d there exists a constant C\ = C\ (B) such that 

\f(x)\<d\i{x)\ for all x £ B. (5) 

Note that from continuity it follows that if i{x) ~ for x £ B, then f(x) = 0. A 
useful constant throughout this paper will be Ci = C*2(i?) := C\ + \ . 

Methods for approximating the solution to this type of ODE that simultaneously 
preserve the integral are of interest in many applications. For example, Hamiltonian 
systems, Poisson systems, celestial mechanics, the Lotka-Volterra system and the 
undamped Duffing oscillator (see [4] and references therein). Here we consider 
discrete gradient methods for approximating the solution to (1) whilst exactly 1 
preserving / (see e.g. [9, 13, 15]). 

Let us first define a special type of discretization of the gradient of /, a discrete 
gradient of I. 

Definition 1.1. (Gonzalez [2]). A discrete gradient of /, denoted kR^xK^ R d , 
is continuous and satisfies 

i(x, x') ■ (x' — x) = I(x') — I(x) and i(x, x) = i(x) for all x, x' £ M. d . 

There are several ways of constructing a discrete gradient. Two notable examples 
are the one used in the averaged vector field method (called the mean value discrete 
gradient in [9], see also [14]) and the coordinate increment method [8]. 

Given a time step h we define a discrete gradient method by the map x n- x 

. [ x + hS(x, x' , h)i(x, x') if i(x) ^ 0, , . 

X = \ ■ff\n ( 6 > 

I x it i{x) = U, 

where i is a discrete gradient of / and S is any skew-symmetric consistent approxi- 
mation of S. By consistent we mean that S(x, x 1 ', h) is continuous and S{x, x, 0) = 
S(x) on {x £ M. d : i{x) ^ 0}. All discrete gradient methods preserve / because 

I(x') — I(x) — i(x, x') ■ (x' — x) — h(i(x, x')) T S(x, x , h)i(x, x') — 0. (7) 

The final equality in (7) is because S is skew-symmetric. 

By discretizing the default formula for S(x) given in (4) we obtain an example 
of a discrete gradient method (there are many different possible discrete gradient 
methods for (1)). Let i(x,x',h), i(x,x',h) and i(x,x',h) be consistent approxima- 
tions to i(x), so that they are all continuous and 

i(x, x, 0) = i(x, x, 0) = i(x, x, 0) = i(x) for all x £ M. d , 

let f(x,x',h) be a consistent approximation of f(x), and let i{x, x') be a discrete 
gradient of /. Then a discrete gradient method applied to (1) is defined by (6) with 



i.e. up to round-off error or a larger specified tolerance. 
2 It will be our convention to let x = x n (the approximate solution at step n) and x' = x r , 
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S(x, x', h) given by 



S(x, x', h) 



fjx, x', h)(i(x, x', h)) T - i(x, x' , h)(f(x, x' , h)) T 
i(x, x', h) ■ i{x, x', h) 



(8) 



A useful way of describing the accuracy of a numerical method for solving (1) 
is to determine its order of accuracy. For one-step methods this is defined by the 
truncation error around the point x for a time step h. 

Definition 1.2. A one-step method x i->- x' with time step h for solving (1) has 
order of accuracy p G N, if there exist positive constants C and H such that 



where x(-) denotes the solution to (f) with x(t) = x for some t £ R+ and B is 
a compact set in W 1 . The constants C and H may depend on B but should be 
independent of x and h. 

This definition (taken from [f , Def. V.f .3]) is more precise about the dependen- 
cies for the constants C and H than the definitions for order p given in other texts 
(e.g. [5, Def. II.1.2] and [4, Def. II.1.2]) where it is defined by \x' - x(t + h)\ = 
0(h p+1 ) as h — > 0. These other definitions are ambiguous regarding how the hidden 
constant in £>(•) may depend on other parameters, and how small h should be. By 
using the definition from [f ] in our results we can be sure that the constants in the 
definition of order p do not depend on |z(x)|, which may be small. 

Throughout this paper we will also make use of Banach's Fixed Point Theorem 
(also called the Contraction Principle), see e.g. [7, Thm. 3.1.2 on p. 74]. 

Theorem 1.3 (Banach's Fixed Point Theorem). Let (X,d) be a non-empty com- 
plete metric space. Let T : X —¥ X be a contraction on X, i.e. there exists a 
q G (0, f ) such that 



Then there exists a unique fixed point x* € X such that T(x*) = x* . Furthermore, 
the fixed point can be found by iteration, x n+l — T{x n ) for n = 0,1,2,... with 
x° G X arbitrary. 

In Section 2 we prove that discrete gradient methods where S has the form (8) 
are well-defined in the sense that provided h is sufficiently small and /, i, i, i and 
i satisfy certain consistency and local Lipschitz continuity conditions, then there 
exists a locally unique solution to (6). In Section 3 we prove that for arbitrarily 
chosen p G N, if /, i, i, i and i satisfy two additional conditions (/ defines a method 
of order p and \i-i — i-i\ is bounded in a special way) then we get a discrete gradient 
method of order p. 

In Section 4 we consider discrete gradient methods from the perspective of doing 
computations. Generally, each step of a discrete gradient method requires solving 
a nonlinear system of equations for x' and this may add a significant amount to the 
computational cost of the method because an iterative scheme, such as Newton's 
method, must be employed at each step. In the case when / is quadratic we present 
a new method that is linearly implicit in x' at each time step, so only a linear system 
of equations must be solved at each step. We also show that Runge-Kutta methods, 
under very mild conditions on the coefficients, give an / that satisfies the conditions 
required in Sections 2 and 3, and therefore we may use Runge-Kutta methods of 
order p (for some p G N) to construct discrete gradient methods of order p. 



\x' -x{t + h)\ < ChP 



for all h G [0, H] and all x G B, 



d(T(x),T(y))<qd(x,y) 



for all x, y G X. 
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Following the theory in these sections we present a numerical example in Section 
5 to illustrate our theory, and finally, in Section 6 we discuss the implications of 
this work and possible avenues for future research. 

2. Existence and uniqueness. At each time step of the discrete gradient method 
we must (in general) solve a nonlinear system of equations (see (6)) for a;', but does 
the solution to this system of equations exist? In this section we present a theorem 
that ensures for sufficiently small time step h, the map from x i— > x' is well-defined 
in the sense that there exists a locally unique solution x' to the system of equations 
(6) for the case when S is given by (8). 

Usual techniques for achieving this type of result include applying the Implicit 
Function Theorem (see [12]) or the Newton-Kantorovich Theorem (see [11]). For 
example, the Newton-Kantorovich Theorem is used in [3] to obtain existence of 
a numerical solution for a symmetric projection method, which requires solving a 
nonlinear system of equations at each time step. In our experience these approaches 
for discrete gradient methods lead to a condition on the time step such as h < 
C|i(x)| r for some positive constants C and r. If we are close to a critical point of / 
(i.e. when \i{x)\ is small) then the theory implies that we must also take h small. 
Our result and its proof below avoid this issue and we show that a solution to the 
nonlinear system of equations for a discrete gradient method ((6) with S defined 
by (8)) exists and is locally unique for a sufficiently small time step independent of 
\i(x)\ (and hence independent of the distance to critical points of I). 

The local nature of our result (everything will depend on an initially chosen 
bounded set B) is only due to the local Lipschitz continuity of / and i, and (5), 
rather than also depending on the distance to critical points of /. 

We will require the following definition of a ball around a point x 6 M. d . Given a 
constant R > and icR 1 ' define 



Note that if i(x) — 0, then Br{x) = {x}. 

The following theorem ensures that, for sufficiently small h and under certain 
local Lipschitz conditions, the map x h-> x' defined by (6) and (8) is locally well- 
defined, in the sense that there exists a locally unique solution to the nonlinear 
system of equations defined by (6) and (8). 

Theorem 2.1. Let B be a bounded set in R d and suppose there exist positive con- 
stants R, L and H such that for each x £ B and all u,v,w € Br(x) and h G [0, H), 



B R {x) := {z £ IT : \z - x\ < 



}■ 



R 



f : R d x R d x M + -)■ R d satisfies 



f(x,x,0) = f(x), 



\f(u,v,h) 
\f(u,v,h) 
\f(x,x,h) 



f{w, v,h)\ < L\u — w 
f(u, w,h)\ < L\v — w\ 
f(x, x, 0)| < Lh\i(x)\ 



(9) 



i : R d x R d 



—¥ R is a discrete gradient of I satisfying 

i(x, x) = i{x), 



\i{u, v) — i{w, v)\ < L\u — w\, 
\i(u,v) — i(u, w)\ < L\v — w\, 



(10) 
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—> R d satisfies 

i(x, x, 0) = i(x), 
\i(u,v,h) — i(w, v, h)\ < L\u — w\, 
\i(u, v,h) — i(u,w,h)\ < L\v — w\, 
\i(x, x, h) — i(x, x, 0)| < Lh\i(x)\, 

and similarly for i : R d x R d x R + -> R d and i : R d x R d x R + -> R d . Let C 2 be the 
constant defined after (5) and define 

R' := max{R, WL} and H' := min{H, {36C l +6)L }- 

Then for each x £ B and h 6 [0, H') there exists a unique x' € Bri(x) satisfying 
(6) where S is given by the formula (8). 

Proof. Note that if x £ B and i(x) — then x' — x is the unique solution to (6) 
in Bri(x) = {x}. For the case when i(x) ^ we will apply Banach's Fixed Point 
Theorem (Theorem 1.3) to prove our result. Let R' and H' be defined as in the 
theorem and for fixed x € B, such that i(x) ^ 0, define X :— Br/(x). X is a closed 
subset of R d , so together with the metric | • |, it is a complete metric space. Also 
fix h G [0, H'), and define T : X -> R d by 

T(z) :— x + hS(x, z, h)i(x, z) for all z£l, 

where S is given by (8). To satisfy the assumptions of Theorem 1.3 we must show 
that T(z) G X for all z G X and that T is a contraction on (X, | • |). 

Let z e X. Using (11), z G Bjy(a;) C S fl (x) (since R' > R), R' > 10L and 
h < j^j- we have 



\i(x,z,h)\ < \i(x) \ + \i(x, z,h) — i(x, x,h) \ + \i(x, x, h) — i(x, x, 0)| 
< (I + -L + Lh)\i(x)\ < l\i{x)\. 

We can derive similar inequalities for i and i, and for i we can derive 



(12) 



\i(x,z)\<^\i(x)\. (13) 

Using (9), (5), C 2 := C x + \, R' > 10 L and h < ^ we also get 

|/(i,«,/»)|<C 2 |t(a:)|. (14) 

Using (11) and (12) for i and i, z G Br>(x), R' > 10L, h < j^- and writing i 
instead of i(x) we get 

i(x, z, h) ■ i(x, z, h) =\i\ 2 + [(i(x, z, h) — i(x, x, h)) 

+ (i(x, x, h) — i(x, x, 0))] • i(x, z, h) 

+ i ■ [(i(x, z, h) — i(x, x, h)) + (i(x, x, h) — i(x, x, 0))] (15) 

>\if_ { m + Lm) i li{ _ m m + Lm 

>\A 2 -^-m 2 -m=m\ 2 >w\ 2 - 
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We get T(z) € X from the following inequality, where we have used (15), (14), (12), 
(13) and h < H' to get 



\T(z) — x\ = h\S(x, z, h)i(x, z)\ = h 



(»■»)/-(/■»)» 



< 



j^\f(x,z,h)\\i(x,z,h)\\i(x,z)\ 



< AhC 2 \ ■ m(x)\ = ^C 2 h\i{x)\ < 6C 2 h\i(x)\ < 



liMJ 



To show T is a contraction, let z, z' G X. Using (15), (11) and (12) for i and i, 
and writing i{x, z, h) as i(z) etc. we get 



i(z)-i(z) i(z')-i(z') 



i{z')-i(z') — i(z)-i(z) 

(i(z'yi(z'))(i(zyl(z)) 

< u^f^I* - Aim\ = - A < i$fr\z - A 

Using (9), (10), (11), (12), (13), (14), (15) and (16) we get 



(16) 



< 



i( z H( z ) 

i(^)-i( f ')[/( z )-/V)l 

»(z)-i(z) 

( ' 



i(^).[i( z )-i(^)]/( 2 ) 



^|i^( i l z -^lTCl^)l C »l*(*)l + IW»)l i N- 2 nC'a|»(ai)| 

+ f Wa t )|^|i(x)|i|z-*'|) + 1 ^|z-»'|||t(x)|iJ|t( a! )|C 2 |i( a! )| 

5 J 2 



r<»c 2 + «gui*-*'l 



<(18C2 + 3)L|z-z'|. 



Using a similar argument we can also derive 



i(z) 



i{z)-x(z) 

Now using (17), (18) and h < H' we get 
\T{z)-T{z')\<h 



W?)^ <(18C 2 + 3)L|,-A 



(17) 
(18) 



I( Z ).»( Z )/( Z )-/( Z )-I( Z )»( Z ) _ i( z ').i( z ')/( z ')-/V).i( z ')i( z ') 



< h 



'( Z H( Z ) 



/(*) 



»( Z 'H( Z ') 



/(*H(*) 

l(z)-i(z) 



i(z) 



% z ?Jj z ? ~i(z') 

i{z')-t{z>) v 7 



v ' i{z')-i{z')' 

< (36C 2 +6)Lh\z- z'\, 

where (36C 2 + 6)Lh < 1. Therefore, T is a contraction on (X, | • |) and by Theorem 
1.3 there exists a unique x' G X such that T(x') = x' . By the definition of T it 
follows that x' satisfies (6) where S is given by (8). □ 

3. Order of accuracy. In this section we give sufficient conditions for a discrete 
gradient method defined by (6) and (8) to be of order p for arbitrarily chosen peN. 
In addition to requiring the same conditions as in Theorem 2.1, we also require two 
further conditions. 

The following two lemmas will be used to prove our main result, Theorem 3.3. 
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Lemma 3.1. For a bounded set B C K d , let R, L, H, /, C2, R' and H' be defined as 
in Theorem 2.1. Then, for each fixed x G B and h G [0,-ff') there exists a unique 
y G Bqjii (x) such that 

y = x + hf{x,y,h). (19) 

Proof. Fix x e B and h € [0,H'), and define X := B 6R >(x) and T(z) := x + 
hf(x, z, h) for each z <E X. For z G X use (14) and h < 6C } R , to get 

\T(z)-x\=h\f(x > z > h)\<hC 2 \i(x)\ < i^i, 
so T(z) G X. For z' G X, use (9) and h < to get 

|T^)-T(^)| = /i|/>,^/i)-/(^^,/i)| <£ft|z-z'| < Tsk-^'l- 

Hence T : X — > X is a contraction and the result follows by applying Theorem 1.3. 
For the case when i(x) = note that f(x, z, h) = for z 6 X = {x} (use (9) and 
|i(a:)|=0). □ 

Lemma 3.2. In addition to the hypotheses of Theorem 2.1 suppose that for each 
x 6 B and all u,v G B§ri(x), f satisfies 

\f(u)-f(v)\<L\u-v\. 

Let x(-) denote the exact solution to (1) with x(t) = x for some t G R+ and 
h G [0, H'), then x(s) exists and satisfies x(s) G B^n> (x) for all s G [f , £ + ft] . 

Proof. If i(cc) = then (by (5)) we are at a stationary point and the result is trivial. 
Suppose i(x) ^ 0. Existence theory for ODEs (see e.g. [5, Thm. 1.7.3 on p. 37]) 
implies there exists a T > t such that x(s) G B^r* (x) for all s G [t,T]. If T > t + ft 
then we are done, so suppose T < t + ft. Existence theory also implies that the 
solution x(s) exists for s G [T, T'] for some T" > T, even though it may not be in 
B 5R .(x). 

For each s G [£, T] we have 

|/(x( S ))| < |/(z)| + L\x(s) - x(t)\ < \f(x)\ + J S \f(x(r))\dr, 

so by the Gronwall Inequality (see e.g. [6, Thm. 1.1 on p. 24]), (5), ft < H' and 
i{x) ^ we have 

\f(x(s))\ < \f(x)\e Lh < CMx)\e 1/W < 
Therefore, for each s G [t,T], 

\x(s) - x\ < J S \f(x{r))\dr < gglMMl < ^h\^)\ < J^]_ 

Since this inequality is strict and x(-) exists up to T' and is continuous, there exists 
an e > such that x(r) G Bsri(x) for all r G [£, s + e]. 

To complete the proof let us argue by contradiction. Suppose there exists a 
s G [£,£ + ft] such that x(s) ^ B 5 ri(x) (this includes the case when x(t + ft) does 
not exist). Then by continuity of x(-), x{t) = x and since B^r'(x) is closed, there 
exists a s' G [£, s) and a S > such that x(r) G B^ri(x) for all r G [£, s'] and 
x(r) ^ B§ri(x) for all r G (s',s' + 5). However, by the above argument there exists 
an e > such that x{r) G B$ri(x) for all r G [£, s' + e). A contradiction. Therefore, 
x(s) G S 5fl/ (a:) for all s G [£, £ + ft] . □ 
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The extra Lipschitz continuity condition on / in Lemma 3.2 follows from our 
earlier assumption that / is locally Lipschitz. 

Now let us present the main theorem of this section, where we show that under 
certain conditions the discrete gradient method defined by (6) and (8) is of order 
p, for some p £ N. 

Theorem 3.3. For a compact set B C M. d , let R, L, H , f , i, i, i i, C 2 , R' and 
H' be defined as in Theorem 2. 1 and let f satisfy the Lipschitz condition in Lemma 
3.2. 

For each x £ B and h £ [0, H') 

1. let x' £ Br>(x) be the unique solution to (6) with S defined by (8) (which 
exists by Theorem 2. 1 ), 

2. let y £ B§ri{x) be the unique solution to (19) (which exists by Lemma 3.1), 
and 

3. let x(-) denote the exact solution to (1) satisfying x{t) — x for some t £ K+ 
(which exists on [t,t + h] by Lemma 3.2). 

Also suppose that 

4- f is such that the method defined by (19) is of order p for some p £ N, i.e. 
there exist positive constants C3 and H3 < H' such that 

\y - x(t + h)\ < C 3 h p+1 , for all h e[0,H 3 } and allx £ B, (20) 

and 

5. there exists a positive constant C4 such that for each x £ B and all h £ [0, H3], 
\l(x,x',h)-i(x,x',h)-i(x,x',h)-i(x,x')\ < C 4 (\x' - x(t + h)\ + h p+1 ) \i(x)\. (21) 

Then the discrete gradient method defined by (6) with S given by (8) is also of order 
p, so that there exist positive constants C5 and H5 such that 

\x' - x(t + h)\ < C 5 h p+1 , for all h £ [0, H 5 ] and all x £ B. 

Proof. Define < H 5 < min{ff 3 , 3^^} < H' and C 5 = |g + Fix x £ B 

and h £ [0,H 5 ]. The first step in the proof is to bound \hf(x,x',h) ■ i(x,x')\. We 
get 

\hf(x,x',h) -i(x,x')\ <\hf(x,y,h) ■ i(x, x(t + h))\ 

+ \hf{x, y, h) ■ [i(x, x') - i(x, x(t + h)] | 

+ \h[f(x,x',h) - f{x,y,h)] ■ i(x,x')\ 
= :T l +T 2 + T 5 . 

Now bound each Ti separately. Using (x(t + h) — x) ■ i(x, x(t + h)) = I(x(t + h)) — 
I{x) = 0, (20) and (13) (with z = x(t + h) £ B 5R >(x) C B R ,{x)) we get 

T\ :=\hf(x,y,h) ■ i(x,x(t + h))\ = \(y - x) ■ i(x,x(t + h))\ 

=\(y-x(t + h))-i(x,x(t + h))\ (23) 

<^C 3 hP +1 \t(x)\. 

Using (14) (with z = y £ B ew (x) C B R ,(x)), (10) and h < H' < we get 

T-2 :=\hf(x,y, h) ■ [i{x,x') - i(x,x(t + h)]\ 

<hC 2 \i{x)\L\x' - x(t + h)\ (24) 
<Mi{x)\\x' - x(t + h)\. 
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And using (9), (13) (with z = x' G B R ,(x)) and h < H' < we get 

T 3 ■= \h[f(x, x', h) - f(x, y, h)] ■ i(x, x')\ 
<hL\x'-y\^\i(x)\ 
\i{x)\\x' - y\. 



(25) 



< 



100 



Putting (22) together with (23), (24) and (25) we get 

\hf(x,x',h)-i(x,x')\ < (^C 3 hv +1 + ±\x'-x(t + h)\ + ^\x'-y\) \i(x)\. (26) 
Now consider bounding \x' — y\. We get 



< h 



sr. , , , i{x 7 x' .h)-i(x.x') ~ 
i{x,x,h)*i{x,x,h) 



,h)-i(x ,x ,h) 

hf(x. i x\h)-i(x,x') 



i(x.x\h) 



i(x,x' ,h)-i(x ,x' ,h) 



(27) 



=:T 4 + T 5 . 



Now bound T 4 and T 5 separately. Using (9), (14), (15), (12), (13) (with z = x' £ 
B R ,(x) and z = y e B R ,(x)), (21) and h < mm{H 3 , j^, 36 c 2L } we get 

< hUf(x,x%h)-h*,y,h)) i l(, °' , °'- h)ri(x -' : ' ) 



< hL\x' - y\ 



(x,x ,h)-i(x,x ,h) 

i(x,x' ,h)-i(x,x ) 
i{x,x' ,h)'i(x,x' ,h) 



\+h\f(x,y,h) 



+ hC 2 \i(x)\ 



i(x,x' ,h)-i(x,x') 
i(x,x' ,h)-i(x,x' ,h) 



<2hL\x'-y\§-£ 



+ 2C 2 C 4 h{\x' - x{t + h)\ + h p+1 ) 
<jo\x'-y\ + 2C 2 C A h\x' -x(t + h)\ + ^hP +1 . 
And using (15) (with z — x'), (12) and (26) we get 



< 



~t I i \ hf(x,x ,h)-i(x,x ) 

l ( X > X ' "-) ■/ h \ '( h \ 
■ ' z{x,x ,ti)-i{x,x ,ri) 

5$fl\hf(x,x',h) -i(x,x')\ 



< i§C 3 hP +1 + ±\x'- x(t + h)\ + ^\x'- 

< 3C 3 h p+1 + ±\x' - x(t + ft) | + - y\ 



v\ 



151^ "\" 1 "71 1 io I 

Putting (27) together with (28) and (29) we get 

W -y\<l\x'-y\ + (2C 2 C 3 h + ±) \x' + x(t + h)\ + + 3C 3 ) h?+\ 
and hence 

W -y\< (5C 2 C 4 h + ±) \x' + x(t + h)\ + + ^) hP +1 . 



Our result now follows easily from (30) and (20) since h < min{iJ3, 



30C2C4 



Therefore, 



x(t + h)\ < \x' - y\ + \y - x(t + h)\ 

< {hC 2 C 4 h + i) \x' + x(t + h)\ + + i^) h p+1 
<^-x(t + h)\ + (^ + ^)h^\ 

<t + h)\<l($t + ^) h p+1 = (§& + *fa) h p+1 = C 5 h p+ \ 



(28) 



(29) 



(30) 



□ 
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4. Discrete gradient methods for computation. In this section we consider 
discrete gradient methods that may be used in computations, and how we might 
choose /, i, i, i and i so that they satisfy the hypotheses of Theorems 2.1 and 3.3. 
We also consider how to make the nonlinear system of equations defined by (6) and 
(8) as easy as possible to solve at each time step. In the case when I is quadratic we 
achieve this by constructing a discrete gradient method so that it is linearly implicit 
in x' . In general this is not possible, except in the case when / is quadratic. 

Since f(x) and i(x) are locally Lipschitz continuous, for a given bounded set 
B C R d and constant Rq > there exists a constant Lq > such that for all x G B 
and for all u, v G Br (x) 

\f(u)-f(v)\<L \u-v\, 
\i(u) — i(v)\ < Lq\u — v\. 

A possible choice for / is so that the method defined by (19) is a Runge-Kutta 
method. The following definition of a Runge-Kutta method has been modified from 
[4, chap. II] for the case of autonomous ODEs. 

Definition 4.1. Let 6j, ay — 1, . . . , s) be real numbers and let ft. > be the 
time step. One step of an s-stage Runge-Kutta method defining a map for 
approximating the solution to (1) is defined by 

h = f ^x + h ajjkj^J i = l,...,s, (32) 

s 

x' = x + h bjkj. (33) 

i=l 

For / in (19) to correspond to an s-stage Runge-Kutta method then we must 
define 

f = f(x,h):=J2 b i k i> ( 34 ) 

i=l 

where the hi are (implicitly) defined by (32). Even though the hi may be implicitly 
defined by (32), as for Implicit Runge-Kutta methods, we may use the map K, : 
R d x M + — !> W dxs (defined below in Lemma 4.2) to explicitly represent each k{ in 
terms of x and h as the i th column of IC(x,h). Since the ki do not depend on x' 
we get a / that does not depend on x' and we may write / = f(x, h) instead of 
f = f(x,x',h). 

For a given s-stage Runge-Kutta method two constants that will be useful are 

s s 

A\ := max \a^\ and A2 :— \bi\. 

3 = \ i=l 

For completeness, the following lemma gives conditions on h so that an s-stage 
Runge-Kutta method is well-defined in the sense that the map x 1— > {ki : i = 
1, . . . , s} is locally well-defined (so that the map x H> x' is also locally well-defined). 
Although it is a bespoke result for this paper, the proof is very similar to that given 
for [5, Thm. II.7.2]. 

Lemma 4.2. Let B C M. d be a bounded set and let Ro > be a constant. Let Lq 
be the constant from (31), and let C\ be the constant from (5). Define 

H := mm{ r ^-, 2 a 1 (Ci+l /-Ro)Bo ' 2L a 1 (c 1 +l / r )r } ■ 



DISCRETE GRADIENT METHODS 



11 



Then for each x G B, h G [0, H ) and u G B 2 r (x) there exists a unique K 
[hk 2 - ■ ■ k s ] G {[m x m T ■ ■ m s ] G M rfX;i : K - < MMi} satisfying 




Let us define a map K, 



pdxs 



i = l,...,s. 



such that K = /C(u, h). 



(35) 



Proof. We again apply Banach's Fixed Point Theorem (Theorem 1.3). Fix x G B, 
h G [0,H) and u G B 2Ro (x). For if = [fcxfo- • • Jfe s ] G M dxs let ||if|| = max, |fc,| and 
define X := {[m x m 2 - • • m s ] G M dX5 ' : |m, - < L °^ a;)l } and T : X ^ R dxs by 



T(#) := [life-../. 



where i, := / u + h aijfcj 



for each if = [fcife' • • fc s ] G X. To apply Theorem 1.3 we must show that T[K) G X 
for all if G X, and that T : X — > X is a contraction. 

Let K = [k 1 k 2 ---k s ],K l = [k[k' 2 - ■ ■ k' s ] G X, T(K) = [hh-'-ls] and T(K') = 
[l[l 2 - "I'sl- First note that by the definition of X and using (5) we have 

\\K\\ < max(|/(x)| + fa - f(x)\) < \f(x)\ + ^feil < ( Cl + j£) \i(x)\. (36) 
Using this, and since h < 2A 1 (c 1 +l /r )r an< ^ 11 e B%Ro( x )> we have 



/; ^ a, ,k, - 

3=1 



< |u - x\ + ftAill^ll < Jgji + ^ (Ci + |t-(a:)| < 



Hence u + hJ2j=i a ijkj G B Ro (x). Using (31) and (37) we then get 

\h-f{x)\ <L 
Hence T(K) G X. By (31) we also have 



KM I 
(37) 



h'y^a i jkj 

3=1 



^ L \i(x)\ 
- Ro ■ 



< hL 



3=1 



< HLqAAK - K'\ 



LnAi 



T : X 



X is a contraction 
□ 



so \\T{K)-T(K')\\ < hL Ai\\K-K'\\. Since h < 
and the result then follows by applying Theorem 1.3. 

The following lemma is a technical result for the subsequent corollary. 

Lemma 4.3. Let f(x,h) be defined by (34), where the ki are defined by an s-stage 
Runge-Kutta method with 53j=i bi — 1. Let B C M. d be a bounded set and Rq > a 
constant. Let Lq be the constant from (31) and Ci be the constant from (5). 

Then f{x, 0) = f(x) for every x G B, and there exist positive constants R, L and 
hi such that for each x G B, and for all u, v G Br(x) and h G [0, H) 

\f(u,h)-f(v,h)\<L\u-v\, 

\f(x,h)-f(x,0)\<Lh\i(x)\. 
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Proof. Define R := 2Rq, L := L A 2 max{2, A^d + jg)} and 



H := min{ 



Ro' 

1 1 1 

2L„A 1 > 2A 1 (C 1 +L /R )R > 2L A 1 (C 1 +L / R )R 



h 



and fix x 6 B. If h — 0, then ki = ki{x, 0) = f{x) for all i, and using Y]^_ -, = 1 
we get f(x,0) = J2Ui hf{x) = f(x). 

For h £ (0, if) and b,«6 Br(x), we have from Lemma 4.2 that there exist unique 
K u = K.(u,h) and K v = K(v,h) in {[m 1 m 2 ---m s ] £ R dxs : |m» - f(x)\ < ^^f^} 
satisfying (35). Using (35), (31) and h < 2 lIa ± > we nave 

\\K U - K v \\ < L \u - v\ + hL Q AAK u 



K,, 



<L Q \u-v\ + \\\K u -K v \ 



(38) 



Hence, \\K U — K v \\ < 2L \u — v\. Using this, (34) and A 2 :— J2t=i 1^1 we § e * 

\f(u,h)-f(v,h)\<A 2 \\K u -K v \\ <2L A 2 \u-v\<L\u-v\. 

Finally, using (34), ^i=i ^ = 1j (31); (36) and writing ki(x,h) for the i th column 
of /C(x, h), 



\f(x,h)-f(x,0)\ = 



&i/ci(x, /;.) - ^/(x) 
i 



< A2 max \ ki(x, h) — f(x)\ 



A 2 max 



/ I x + h ay fcj (a;, ft,) 



< A 2 L /iAi||i<r|| < AiLahAx (d 



Ln. 

Ro 



/(*) 
i{x)\ < Lh\i(x)\. 



□ 



Corollary 4.4. All consistent Runge-Kutta methods (so that 5Z* =1 b{ — 1) define 
a f (see (34) ) that satisfies condition (9) in Theorem 2.1. 

If we define / corresponding to an explicit s-stage Runge-Kutta method (where 
fly = for i < j), then the ki in (32) are defined explicitly in terms of x and h and 
f(x, h) may be calculated explicitly (instead of using the map /C which may not be 
computed explicitly). To obtain a Runge-Kutta method that is of order p there are 
additional constraints on the and hi (e.g. see [4, p. 29]). 

To check whether or not i = i(x, x') satisfies (10) we only need to ensure that it is 
locally Lipschitz continuous since i(x, x) — i(x) is already satisfied by the definition 
of a discrete gradient. Moreover, since i{x) is locally Lipschitz, it easily follows that 
both the coordinate increment discrete gradient (see [8]) and the one used in the 
average vector field method (see e.g. [9, 14]) are also locally Lipschitz. For general I 
there are no known explicitly defined discrete gradients. However, if / is quadratic 
then i(x) is linear and we may define i{x, x') so that it is linear in x' by taking 



i(x, x') := i (^-) = |(i(x) + i(x')). 



(39) 



There is considerable freedom over how we choose i, i and i in (8), and to apply 
Theorems 2.1 and 3.3 they only need to satisfy (11) and (21). For example, we may 
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define i to be any of the following (and similarly for i and i): 



i(x) 


= 




i(x') 


= i(x'), 




i(x, x') 


= \m- 


H(z')) 


i(x, x ) 




)• 


i(x, x') 


= i(x,x'). 




i(x, h) 


= i(y) 


where 



Except for the final case when i(x, h) = i(y), it is obvious that all of these choices for 
i satisfy (11) since i(x) is locally Lipschitz. To confirm that i(x,h) — i(y) satisfies 
(11) we prove the following two lemmas. 

Lemma 4.5. Let B C R d be a bounded set and suppose there exist positive constants 
R, L and H such that for each x £ B, and for all u,v,w G Br{x) and h G [0, H), 
f : R d x R d x R + -> R d satisfies (9). Let d be the constant from (5). 

Then for each x G B , u G B2r(x) and h < min{iJ, j^, ^c 1 +2L/r)r ^ l ) there 
exists a unique y G Br(x) satisfying 

y = u + hf(u,y,h). 

Moreover, if u — x then y G B2r{x). 

Proof. Fix x G B, u G B 2 r(x) and h < min{iJ, (c 1 +2L/r)r ' T^ - Define X := 
B R (x) and T : X ->• R d by T(z) := u + hf(u, z, h) for each z G X. Let z, z' G X. 
Using (9), (5), u G B 2 r(x), z G Br(x) and ft < {Ci +2L/r)r} we S et 

|T(z) — x\ < \u — x\ + h\f(u, z, h)\ 

<\u-x\+ h(\f(x)\ + L\u - x\ + L\z -x\ + Lh\i(x)\) 



< 



2R 



so T{z) G X. From (9) we also get \T(z) - T{z')\ = h\f{u,z,h) - f(u,z',h)\ < 
hL\z — z'\. Since h < j-, T : X —> X is a contraction and the first part of the result 
then follows by applying Theorem 1.3. 

If u = x then repeating this argument with X := B 2 r(x) yields y G B2r(x). □ 

Lemma 4.6. Let B C R d be a bounded set and let Rq be a positive constant. Let Lq 
be the constant from (31), and let C± be the constant from (5). Suppose there exist 
positive constants R, L and H such that for each x G B, and all u,v,w G Br(x) 
and h G [0, H ) that f : R d x R d x R + -> R d satisfies (9). Define R x := 2R, L x := 
max{L,i ,io(Ci + ^)} 7 H\ := min{iJ, ( Ci +2L/r)R ' Jl)' andi : K d xM + ^ M d 
such that 

i(u, h) := i(y) where y satisfies y = u + hf(u, y, h) 
for all x G B, u G Br^x) and h G [0, H\). 

Then for each x G B, and for all u, v G Br 1 (x) and h G [0, -Hi), i satisfies 

i(x, 0) = i(x), 

\i(u, h) — i(v, h)\ < Li\u — v\, 

\i(x, h) — i(x, 0) < Lih\i(x)\. 
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Proof. Fix x £ B, u,v £ -B_r x (x) and h £ [0, Hi). Let y = y(x, ft) denote the solution 
to y = x + hf(x,y,h), and similarly for y(u,h) and y(v,h). From Lemma 4.5 we 
know these solutions exist and that y(x,h) £ B 2 r{x) and y(u,h),y(v,h) £ Br(x). 
If ft = then y = x and i(x, 0) = i(x). 

If h ^ 0, since ft), t/(u, ft) £ Br(x), we may use (9) and ft. < ^ to get 

|j/(u, ft) - h)\ = h\f(u, y(u, ft), ft) - f(v, y(v, ft), h)\ 
< hL\u — v\ + hL\y(u, ft) — y(v, h)\ 
<%\u-v\ + \\y(u,h) -y(v,h)\, 

and so \y(u, ft) — y(v, ft)| < \u — v\. Using this and (31) we get 

\i(u,h)-i(v,h)\ = \i(y(u,h))-i(y(v,h))\<L \y(u,h)~y(v,h)\<L \u-v\<Li\u-v\. 
Using (31), (9), (5), y(x,h) £ B 2 r(x) and ft < ^ we also get 
|i(x, ft) — i(x, 0)| = \i(y(x, ft)) — i(x)\ 

< io|?/(x, ft) - x| = L h\f(x, y(x, ft), ft) 

< L ft(|/(x)| + L|y(x, ft) - x| + Lh\i{x)\) 

< L h(d + 27? + 27?) |«(x)| 

< Lih\i(x)\. 

□ 

In Theorem 3.3 we also require that i, i, i and i satisfy condition (21). By taking 
i(x,x',h) := i(x,x',h) and «(x,x',ft) := i(x, x') then this is achieved trivially, and 
the resulting method is equivalent to a projection method (see [10]). Unfortunately, 
in this case the system of equations to solve at each time step is nonlinear in general. 

A method that is almost a projection method is the following. For general i, i 
and / satisfying the conditions in Theorem 2.1 and Theorem 3.3, define i(x, x', ft) := 
i(x, x 1 , ft) and i(x, ft) := i(x, y{x, ft)) where y = y(x, ft) satisfies y = x + ft/(x, y, ft). 
The following two lemmas ensure that i satisfies (11) and i, i, i and i satisfy (21). 

Lemma 4.7. Let B C R rf be a bounded set and let C\ be the constant from (5). 
Suppose there exist positive constants R, L and H such that for each x £ B, and 
all u,v,w £ B R (x) and ft 6 [0,H) that f : R d x R d x R+ -> R d satisfies (9), and 
i : R d x R d -» R d is a discrete gradient of I satisfying (10). Define R\ := 2R, 
L\ := max{2L,i(Ci + and Hi := min{ff, ^r, ^(d+L/fl) ' Tl,}> an ^ a ^ so define 
i : R d x R + R d sttcft iftai 

i(u, ft) := i(u, y) where y satisfies y = u+ hf(u, y, ft), 

for all x £ B, u £ B^ix) and ft £ [0, Hi). 

Then for each x £ B, and for all u,v £ B^ (x) and ft £ [0, Hi), i satisfies 

i(x, 0) = i(x), 

\i(u, ft) — i(v, ft) < Li|u — w|, 

|i(x, ft) — i(x, 0)| < Lift|i(x)|. 

The proof of this result is very similar to the proof of Lemma 4.6 so we omit it. 
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Lemma 4.8. Let B C M. d be a compact set and suppose that R 1 L, H, f, i and i all 
satisfy the conditions of Theorem 2.1. Define i = i. Define R\, L\, H\ and i as in 
Lemma 4-7, and define 

R[ := max{R u lOLi} and H[ := mm{Hi, jfc, (36Ca 1 +6)£l }■ 

Then f, i, i, i, i R[ and H[ satisfy the conditions in Theorem 2.1 with R, L, H , 
R' and H' replaced by R\, L\, Hi, R[ and H[ respectively. 
Moreover, for each x G B and h G [0, H[) 

1. let y G Bqri (x) be the unique solution to (19) (that exists by Lemma 3.1), 

2. let x' G Br/ (x) be the unique solution to (6) with S defined by (8) (that exists 
by Theorem 2.1), and 

3. let x(-) denote the exact solution to (1) satisfying x(t) for some t G K+. 
Also suppose that 

4- f is such that the method defined by (19) is of order p for some p G N, i.e. 
there exist constants C3 and H3 < H' x such that 

\y -x(t + h)\< C 3 h p+1 , for all h G [0, H 3 ] and all x G B. 

If we define C4 :— ^f 1 max{l, C3}, then i, i, i and i satisfy (21). 



Proof. The fact that /, i, i, i, i R[ and H[ satisfy the conditions in Theorem 2.1 
with R, L, H, R' and H 1 replaced by i? l5 L-y, H\, R[ and H[ respectively, follows 
from i?i > R, L x > L and H x < H. 

Using (12) and (10) (with L replaced by Li) we get 

\i(x, x' , h)-i(x, x' , h) — i(x, x', h)-i(x, x')\ — \i(x, x', h) ■ [i{x, y) — i(x, x')}\ 

<§|i( a; )|L 1 | a /-y| 

< «|i.|i(a;)|(|x , -a;(t + ^)| + \y-x(t + h)\) 

< ^ (\x' - x(t + h)\ + C 3 hP +1 ) \i(x)\ 

< Ci (\x' - x(t + h)\ + h p+1 ) \i(x)\. 

□ 

This lemma leads to the following obvious corollary of Theorem 3.3. 

Corollary 4.9. With the same hypotheses as Lemma 4-8, then the discrete gradient 
method defined by (6) and (8) is of order p. 

If / is quadratic then using (39) to define i and an explicit s-stage Runge-Kutta 
method to define / we can construct a linearly implicit discrete gradient method. 
The following corollary is a direct consequence of (39), Lemmas 4.3, 4.5, 4.7 and 
4.8 and Theorems 2.1 and 3.3. 

Corollary 4.10. Suppose I is quadratic and let f correspond to an explicit s-stage 
Runge-Kutta method of order p, for some p G N. Then the discrete gradient method 
defined by 

, _ ix + ±S{x,h)i(x) + §S(x,h)i(x') ifi(x)^0, 
\x ifi(x)=0, 



where 



c( u\ f{ x , h )i{x) T -i{x)f{x,h) T 
b(x,n) := — = 

i{x)-i(x + y( x ,h)) 
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is linearly implicit in x' , locally well-defined (in the sense that for sufficiently small 
h there exists a locally unique x' at each time step) and of order p. 

5. Numerical examples. In this section we experiment with using the new lin- 
early implicit (when / is quadratic) discrete gradient method constructed in Corol- 
lary 4.10. To demonstrate the efficiency gain due to only needing to solve a linear 
system at each time step we will compare it with the standard projection method 
from [4] on a problem with quadratic /. 

The new discrete gradient method we suggested in Corollary 4.10 for the case 
when / is quadratic corresponds to defining i(x, x\ h) = i(x, x\ h) := i{x), i(x, x') = 
^(i(x) + i(x')) and i(x,x',h) :— i(x,y) where y satisfies y = x + hf(x, h) and / is 
defined by an explicit s-stage Runge-Kutta method. With these choices for /, i, i, 
i and i then the discrete gradient method defined by (6) and (8) becomes the one 
defined in Corollary 4.10. 

In our experiments below we use the classical explicit 4 th order Runge-Kutta 
(RK4) method to define /. It is defined by the Butcher tableau (see e.g. [4, p. 30]): 



•k 








•k 


1 

2 






•k 





1 

2 




■k 








1 




1 


1 


1 1 




6 


3 


3 6 



The entries denoted by * are not required because we are only considering au- 
tonomous ODEs. 

Since / is quadratic, i(x) is linear and there exists a matrix M and a vector b 
such that i(x) — Mx + b for all x G K d . In the case when i(x) ^ 0, to obtain x' at 
each time step of the method in Corollary 4.10, we must solve the linear system 

(Id - %S(x, h)M)x' = (Id + %S(x, h)M)x + hS(x, h)b, (40) 

where Id £ M. d is the identity matrix. Note that the cost of computing f(x,h) at 
each time step is essentially the same as the cost for computing the RK4 method, so 
we already know that computing this new discrete gradient method will cost more 
than the RK4 method. 

To compare this new linearly implicit discrete gradient method with another 
integral preserving method we also consider the standard projection method (see 
Algorithm IV. 4. 2 in [4]) with RK4 as the underlying method. The algorithm is: 

1. Compute y by solving y = hf(x, h) (where / corresponds to the RK4 method), 

2. Compute x' by solving x' = y + Xi(y) such that I(x') — I(x) for x' and A € M.. 

Actually, step 2 above is what is suggested in equation (4.5) of [4], after Algorithm 
IV. 4. 2, as a more convenient nonlinear system to solve (by reducing the number 
of evaluations of i(-) required). To solve the nonlinear system in step 2 Hairer, 
Lubich and Wanner use the following simplified Newton iteration (see [4, p. Ill] 
for details) 

x n - h x x I(v + Ki(y))-I(x) 
A = and A i+ i = A 4 — — — . 

i{y) ■ ny) 

Once this iterative scheme has converged to A* then x' is computed using x' = 
x + X*i(y). 
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5.1. Modified rigid body motion. In the following example we will compare 
the performance of our new discrete gradient method with the RK4 method and 
another integral preserving method, the standard projection method (all described 
above). We first demonstrate the benefits of preserving the integral by inspecting 
phase space plots for the RK4 method and our new discrete gradient method. We 
will see that the integral preserving method does a much better job of following 
the trajectory of the exact solution. To compare the errors we include all three 
methods. We will see that the errors for all three methods are of similar size, that 
all three methods are of the same order, and that our new discrete gradient method 
is more efficient than the standard projection method. 

The example we use for our computations is a modification to the equations for 
rigid body motion in three dimensions (see e.g. Example 1.7 in [4, p. 99]). For 
a parameter a £ R, the augmented equations of motion for a body with centre of 
mass at the origin are 

-X3 x 2 - c 

x 3 -X! x 2 /h , (41) 
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where Ii, I2 and I3 are also parameters. In the case when a = this system 
reduces to the equations for rigid body motion where the vector x = (xi, X2, xz) T is 
the angular momentum in the body frame and the Ii parameters are the principal 
moments of inertia. Moreover, when a = there are two quadratic first integrals, 
but in the general case when a/0 then the only first integral is, 



I(x) 



x 2 



b-i Jo n Jur> 

— + — + — 

Ji h h. 

Notice that (41) has the form of (3). 

In our computations we have taken 1\ = 2, I2 = 1, I3 = 2/3, a = 1 and we have 
used the initial condition xo — (cos(l.l), 0, sin(l.l)) T at t = 0. Except for a, these 
are the same values used in [4] . 

In Figure 1 we see that phase space, projected onto the rcia^-plane, is more ac- 
curately represented when we compute the solution using our new discrete gradient 
method instead of the RK4 method. Here we have used two different time steps 
(h = 0.5 and h — and computed up to a final time of t = 500. In the plots, 
the solid grey line is the exact solution and the black dots are the approximate 
solution at each time step using either RK4 or our new discrete gradient method. 
For the larger step size of h = ^ the RK4 method appears to converge to equi- 
librium which is the wrong type of asymptotic behaviour. For our new discrete 
gradient method, while the errors are clearly quite large for this larger step size, 
the solution appears to be circulating around a periodic orbit which is the correct 
asymptotic behaviour for this example. Another possibility with the RK4 method 
(not observed in this example) is that the solution will blow up at some critical 
time (for example with a = 2, /i=|^atf~ 100). This cannot happen for integral 
preserving methods such as our new discrete gradient method. 

In Figures 2 and 3 we compare the errors for the three different methods: RK4, 
the standard projection method, and our new discrete gradient method. In Figure 
2 we have plotted the solution error and integral error versus time for the three 
different methods. We see that the solution error is initially similar for all three 
methods, it grows as time increases, and then remains bounded. The integral error 
plot clearly shows that the integral preserving methods preserve the integral up to 
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Figure 1. Phase space projected onto the xia^-plane of modified 
rigid body motion comparing the performance of the RK4 method 
and our new discrete gradient method, computing the solution for 
t e [0, 500] with h = 0.5 (top) and h = ^ (bottom). The solid grey 
line is the exact solution and the black dots are the approximate 
solution at each time step. 



double machine precision and are vastly superior in terms of preserving the integral 
than the non-integral preserving RK4 method. These computations used a fixed 
time step of h = 0.5 for all three methods and computations were performed up to 
a final time of t = 500. 

In Figure 3 we compare the performance of the same three methods for different 
step sizes. We are interested to see whether or not our new discrete gradient method 
is of the same order as RK4 (order 4), and to compare the efficiency of our new 
discrete gradient method with another integral preserving method, the standard 
projection method, where a nonlinear system of equations must be solved at each 
time step. By plotting the solution error at time t = 100 for different step sizes 
(h G [10 _3 5 ,1]) in the left plot of Figure 3, we confirm that our new discrete 
gradient method is of order 4, the same as RK4 which is the underlying method 
defining /. In the right plot of Figure 3 for the same range of step sizes we have 
plotted the solution error at t = 100 against the CPU time required to compute the 
solution up to t — 100. In this way we can compare the efficiency of these methods. 
The plot clearly shows that our new discrete gradient method is more efficient than 



DISCRETE GRADIENT METHODS 19 




Figure 2. Solution error and integral error vs. time for the RK4 
method, the standard projection method and our new discrete gra- 
dient method, h — 0.5. 




Figure 3. Order and efficiency of our new discrete gradient 
method compared with the RK4 method and the standard pro- 
jection method. 



the standard projection method for this problem because it yields smaller errors 
using less computational effort. The plot also shows that the RK4 method is more 
efficient again. Since both integral preserving methods effectively compute the 
RK4 approximation within their methods the computational cost required by these 
methods is more than RK4. Moreover, in the left plot of Figure 3 we saw that the 
size of the error for all three methods is similar. For these reasons RK4 is the most 
efficient method, however, RK4 does not preserve the integral and over longer time 
intervals it often has the wrong asymptotic behaviour. 

Also notice in Figure 3 (right) that the difference in efficiency between our new 
discrete gradient method and the standard projection method is more pronounced 
for larger time steps. This is probably due to the fact that the initial guess (the 
RK4 solution) in the Newton iteration for calculating the projection step in the 
standard projection method is more accurate for smaller time steps, resulting in 
fewer iterations until the convergence test is satisfied. 
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5.2. A time step criterion. A key feature of the existence and order of accuracy 
results in this paper is the fact that we may take x as close as we like to a critical 
point of I without any additional constraints on the time step. By considering 
different x values, and computing a single time step to get x' for different time 
steps we can show that this feature of our results is illustrated in the modified rigid 
body motion example. As criteria for a valid time step we consider the denominator 
of S (which should be positive) and the condition number (ratio between the largest 
and smallest eigenvalues) of the matrix Id — ^SM (see (40)). The initial points we 
consider are x = (R cos(l.l), 0, iZsin(l, 1)) T for R = 1 (this is the initial condition 
used in our earlier simulations and is far away from a critical point of /), respectively 
R = 0.1 and R = 0.01 (which is near to the critical point (0, 0, 0) T of I). 

In Figure 4 we have plotted condition number of Id — ^SM, the denominator of 
S and the error after a single time step vs. time step, for different starting x (R = 1, 
R = 0.1 and R = 0.01). We see that as the time step is increased there seem to be 
critical values where the condition number blows up, the denominator veers down 
to zero, and the error no longer behaves with the same asymptotic behaviour with 
respect to the time step. We see that for x close to the critical point the largest 
allowable time step actually increases. This is consistent with our theory. 

6. Conclusion. In this paper we have analysed discrete gradient methods from 
first principles. We have established the bare essentials in terms of local Lipschitz 
continuity conditions and other criteria to ensure that these types of methods are 
locally well-defined and are of order p. A key feature of our analysis is that we have 
removed any dependence of the time step on the distance to critical points of the 
preserved integral and all of the constants in our results are independent of \i(x)\. 

Although we have been careful to trace the value of constants through our proofs 
we do not make the claim that our constants are optimal. The reasons for this are 
that we have assumed that the same constants R and L can be used in all of the 
inequalities in (9), (10) and (11), and to simplify the presentation we sometimes 
used inequalities that were not completely sharp. If we had more precise knowledge 
of the optimal constants for which (5), (9), (10), (11), (20) and (21) hold, then we 
could repeat the arguments in the proofs of Theorems 2.1 and 3.3 to obtain better 
constants R' and H' in Theorem 2.1, and C5 and H§ in Theorem 3.3. 

As well as considering theoretical conditions for these methods we also developed 
results that will be useful for users of these methods for solving ODEs. We have 
shown how Runge-Kutta methods can easily be used inside the framework of dis- 
crete gradient methods and we have also developed a new method that is linearly 
implicit when the integral to be preserved is quadratic, and of order p for arbitrarily 
chosen p G N. Our numerical experiments confirmed that, in this case, solving a 
linear system at each step instead of a nonlinear system led to significantly reduced 
computational cost. 

The results in this paper can be easily applied to projection methods, see [10], and 
further avenues for research include developing similar theory for discrete gradient 
methods applied to ODEs with Lyapunov functions, and discrete gradient methods 
applied to stiff ODEs, an issue not addressed here. 
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Figure 4. Computations for a single time step of our new discrete 
gradient method applied to the modified rigid body motion example 
(41), for different starting points which depend on R (see text). 
Top left: plot of condition number of Id — ^SM vs. time step; top 
right: plot of the value of the denominator of S vs. time step; and 
bottom: plot of the error vs. time step. 
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